Abstract. The use of formal analysis tools on models or source code often requires
the availability of auxiliary invariants about the studied system. Abstract
interpretation is currently one of the best approaches to discover useful invariants,
especially numerical ones. However, its application is limited by two orthogonal
issues: (i) developing an abstract interpretation is often non-trivial; each transfer
function of the system has to be represented at the abstract level, depending
on the abstract domain used; (ii) with precise but costly abstract domains, the
information computed by the abstract interpreter can be used only once a post
fix point has been reached, something that may take a long time for very large
system analysis or with delayed widening to improve precision. This paper proposes
a new, completely automatic, method to build abstract interpreters. One of
its nice features is that its produced interpreters can provide sound invariants of
the analyzed system before reaching the end of the post fix point computation,
and so act as on-the-fly invariant generators.



1 Introduction and Motivation 

Theoretical frameworks such as abstract interpretation and symbolic (specifically, logic- 
based) model checking have led in the last few years to the development of analysis 
tools that are starting to have a strong practical impact on the development of real-world
software, in particular for safety- or mission-critical systems. Interestingly, current ab- 
stract interpretation and model checking techniques exhibit complementary strengths 
and weaknesses. Model checking techniques so far have been stronger on software 
that is mostly control-driven and not heavily data-dependent. To be effective with data- 
dependent programs, these techniques may require programs to be judiciously anno- 
tated with data invariants. Also, model checking has been traditionally limited to finite- 
state systems, although new approaches relying on solvers for Satisfiability Modulo 
Theories (SMT) are starting to remove that limitation. 

Dually, abstract interpretation techniques are quite effective on data-dependent pro- 
grams, in particular numerical ones, requiring in principle no program annotations. On 
the other hand, they have more difficulties in dealing with control aspects. Also, al- 
though abstract interpretation is a very general framework, most of its applications fo- 
cus on the analysis of source code. Even tools, such as Nbac [13], that target software 
artifacts at a higher level of abstraction (e.g., software models expressed in dataflow
specification languages) do not analyze those artifacts directly and work instead with 
their compilation into C code. This is possibly a consequence of the fact that developing 
an abstract interpreter for a complete language can be time consuming: even if a large 
set of abstract domains, such as those provided by the APRON library [14], is readily 
available, it requires substantial work to define sound abstract transformers for every 
construct of the target language. Another limitation of current abstract interpretation 
techniques is that they typically rely on Kleene-style fix point algorithms to construct 
an abstract semantics of the program under analysis. The properties of such semantics, 
characterized by the concretization of a post fix point of an abstract transformer, can be
obtained only once the post fix point has been (completely) computed. Depending on 
the widening strategies used or, in general, the complexity of the abstractions and the 
semantics considered, one may have to wait a long time to get any information at all 
from the analysis of the program. 

Contributions. In this work we try to address some of the issues above by combining 
techniques from abstract interpretation and logic-based model checking. Specifically, 
we propose a general method for the automatic definition of abstract interpreters that 
compute numerical invariants of transition systems. We rely on the possibility of encod- 
ing the transition system in a decidable logic — such as those typically used by SMT- 
based model checkers — to compute transformers for an abstract interpreter completely 
automatically. Our method has the significant added benefit that the abstract interpreter 
can be instrumented to generate system invariants on the fly, during its iterative com- 
putation of a post fix point. A prototype implementation of the method provides initial 
evidence of the feasibility of our approach and the usefulness of its incremental invari- 
ant generation feature. 

Significance. While motivated by practical issues (namely, the generation of auxiliary 
invariants for a k-induction model checker), the current work is more general and can be
adapted to a wide variety of contexts. It only requires that the transition system semantics
be expressible in a decidable logic with an efficient solver, such as SAT or SMT
solvers, and that the elements of the chosen abstract domain be effectively representable 
in that logic (as discussed later in more detail). Such requirements are satisfied by a large 
number of abstract domains used in current practice. As a consequence, we believe that 
our approach could help considerably in expanding the reach of abstract interpretation 
techniques to a variety of target languages, as well as facilitate their integration with 
complementary techniques such as model checking ones. 

Related work. With the current efficiency of SMT solvers on the one hand and the 
ability of abstract interpretation to compute numerical invariants on the other, the issue 
of combining SMT and AI is receiving increased attention. In [7], Cousot, Cousot and 
Mauborgne draw a parallel between SMT-based reasoning and abstract interpretation. 
They identify the Nelson-Oppen procedure as a reduced product over different
interpretations. While this work is more general, it allows one to understand ours as follows:
the concrete domain presented in Figure 2 is an abstract logical domain, and our concrete
transformer, computed with the aid of an SMT solver, can be understood as an
over-approximation of the concrete transition relation in this abstract logical domain. The
abstraction we built amounts to computing a reduction between a logical and an algebraic
domain, as suggested in [7, §6]. Comparable work in [22] gives an overview of
techniques embedding logical predicates as elements of logical lattices. Some SMT theories
are then formalized within this abstract interpretation view of the analysis: uninterpreted 
function symbols, linear arithmetic, and their combination. 

Another more practical approach, by Monniaux and Gonnord [20], uses bounded 
reachability with an SMT solver to compute a chaotic iteration strategy. The solver iden- 
tifies the equation that needs propagating in order to achieve a better widening. How- 
ever, unlike ours, this solution does not rely on the actual (counter-)models synthesized 
by the SMT solver. In [10], an SMT solver is used to choose among different strate- 
gies in an iteration-based policy analysis. The solver identifies the next strategy that 
will improve the current abstract property. The latter two works rely on SMT solvers to 
help the fix point computation but do not rely on the SMT-based concrete semantics to 
compute the abstract property. 

Another line of work addresses the embedding of abstract interpretation into
logical frameworks. In [11], the authors propose an abstract domain with quantification
over a specific pattern of properties. They provide generic transfer functions and lattice
operators that enable the representation of properties like $\bigwedge_i (P_i \Rightarrow Q_i)$.

Also related is Monniaux's automatic modular abstraction for linear constraints [19]. 
A predicate transformer is defined using quantifier elimination over the semantics of C 
statements, as in an axiomatic semantics (weakest precondition or strongest postcon- 
dition). The transformer is exact for the linear template abstractions considered. It is 
however not clear how this approach can scale to a complete program analysis, since 
the use of quantifier elimination on a complete transition system is not usually feasible. 
In [19] the analyzed blocks are small functions used in a symbol library for Lustre/Scade.

2 Formal Preliminaries 

We rely on basic notions and results from abstract interpretation (see, e.g., [3,4,5]). We
introduce below those that are most relevant to this work, to have a more self-contained
presentation. Similarly, we also introduce relevant notions from symbolic logic and
automated reasoning.

As customary, we will model computational systems as transition systems. A transition
system $S$ is a triple $(Q, \mathcal{I}, \leadsto)$ where $Q$ is a set of states, the state space;
$\mathcal{I} \subseteq Q$ is the set of $S$'s initial states; and ${\leadsto} \subseteq Q \times Q$ is
$S$'s transition relation. A state $q \in Q$ is reachable if $q \in \mathcal{I}$ or $q' \leadsto q$
for some reachable state $q'$.

Abstract Interpretation. Abstract interpretation allows one to analyze a transition
system $S = (Q, \mathcal{I}, \leadsto)$ by first defining a concrete domain for $S$, a partially
ordered set $(D, \sqsubseteq)$, and a concrete transformer, a monotonic function $f : D \to D$.
In this paper we will focus on the collecting semantics

$$\mathbb{S} \stackrel{\mathrm{def}}{=} \mathrm{lfp}^{\subseteq}_{\mathcal{I}}(f)$$

of $S$ where $D = \wp(Q)$, $\sqsubseteq$ is set inclusion $\subseteq$,
$f(X) = X \cup \{x' \mid x \in X,\ x \leadsto x'\}$ and $\mathrm{lfp}^{\subseteq}_{\mathcal{I}}(f)$ is
the least fix point of $f$ greater than $\mathcal{I}$, obtained as the stationary limit of the
ascending sequence $X_0 \subseteq X_1 \subseteq \ldots$ with $X_0 = \mathcal{I}$ and
$X_n = f(X_{n-1})$ for all $n > 0$. However, our work
could be extended to other semantics, such as trace semantics.
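The ascending sequence above can be illustrated on a finite system. The following is only a minimal sketch: the transition relation `counter_successors` is a hypothetical example, not a system taken from this paper, and the iteration terminates here only because the state space is finite.

```python
def collecting_semantics(initial, successors):
    """Kleene iteration X_0 = I, X_n = f(X_{n-1}),
    with f(X) = X ∪ {x' | x in X, x ~> x'}."""
    current = frozenset(initial)
    while True:
        following = current | {x2 for x in current for x2 in successors(x)}
        if following == current:      # stationary: least fix point above I
            return current
        current = following


def counter_successors(x):
    # Hypothetical transition relation: add 2 modulo 10.
    return [(x + 2) % 10]


reachable = collecting_semantics({0}, counter_successors)
print(sorted(reachable))
```

The result is exactly the set of reachable states of the toy system, which is what the collecting semantics denotes.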

The second step in abstract interpretation consists in providing an abstract
representation of the chosen concrete domain, the abstract domain, given by another partial
order $(D^\#, \sqsubseteq^\#)$ (typically a complete partial order if $(D, \sqsubseteq)$ is one).
The two domains are related by an abstraction function $\alpha : D \to D^\#$ and a
concretization function $\gamma : D^\# \to D$ that respectively associate an abstract element,
a member of $D^\#$, to each concrete element, a member of $D$, and vice versa. We call an
abstract transformer any monotonic function $g : D^\# \to D^\#$. A good abstraction function
is closed under intersection, to ensure the existence of the best abstraction for each
concrete element. Galois connection-based abstractions satisfy this desideratum.

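Definition 1 (Galois connection). Two functions $\alpha : D \to D^\#$ and $\gamma : D^\# \to D$
form a Galois connection between two lattices $(D, \sqsubseteq)$ and $(D^\#, \sqsubseteq^\#)$,
which we denote by $\alpha : (D, \sqsubseteq) \leftrightarrows (D^\#, \sqsubseteq^\#) : \gamma$, if
(i) both $\alpha$ and $\gamma$ are monotonic; (ii) for all $y \in D^\#$,
$\alpha(\gamma(y)) \sqsubseteq^\# y$; and (iii) for all $x \in D$, $x \sqsubseteq \gamma(\alpha(x))$.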

We will rely on the following important property of Galois connections. 

Proposition 1 (Unique adjoint in a Galois connection [4]). If
$\alpha : (D, \sqsubseteq) \leftrightarrows (D^\#, \sqsubseteq^\#) : \gamma$ then (i) for all $x \in D$,
$\alpha(x) = \sqcap \{y \mid x \sqsubseteq \gamma(y)\}$; (ii) for all $y \in D^\#$,
$\gamma(y) = \sqcup \{x \mid \alpha(x) \sqsubseteq^\# y\}$; where $\sqcap$ and $\sqcup$ denote
respectively the greatest lower bound and the least upper bound operators in the two lattices.

In a Galois connection, abstract transformers can be related to concrete ones ac- 
cording to the following notion of sound approximation. 

Definition 2. If $\alpha : (D, \sqsubseteq) \leftrightarrows (D^\#, \sqsubseteq^\#) : \gamma$, an
abstract transformer $f^\# : D^\# \to D^\#$ is a sound approximation of a concrete transformer
$f : D \to D$ if for all $x \in D$, $(\alpha \circ f)(x) \sqsubseteq^\# (f^\# \circ \alpha)(x)$ or,
equivalently, for all $y \in D^\#$, $(f \circ \gamma)(y) \sqsubseteq (\gamma \circ f^\#)(y)$.

Abstract transformers in the function space $(D^\# \to D^\#)$ are partially ordered by the
point-wise extension of $\sqsubseteq^\#$, which we denote also by $\sqsubseteq^\#$. In Galois
connections, the set of sound approximations of a concrete transformer has a smallest element
wrt $\sqsubseteq^\#$.

Proposition 2 (Best sound approximation [4]). If
$\alpha : (D, \sqsubseteq) \leftrightarrows (D^\#, \sqsubseteq^\#) : \gamma$, an abstract
transformer $f^\# : D^\# \to D^\#$ is a sound approximation, wrt this Galois connection,
of a concrete transformer $f : D \to D$ iff $\alpha \circ f \circ \gamma \sqsubseteq^\# f^\#$.

The property above implies that $\alpha \circ f \circ \gamma$ is the best abstract transformer
for $f$, in the sense of being its tightest sound approximation.
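As an illustration of the best abstract transformer $\alpha \circ f \circ \gamma$, the following sketch uses the interval domain over integers. The concrete transformer `f` encodes a hypothetical system (not one from this paper), and computing $\gamma$ explicitly is only possible here because all intervals involved are finite.

```python
BOTTOM = None  # abstract element denoting the empty set of states


def alpha(states):
    """Best interval abstraction of a finite set of integers."""
    return (min(states), max(states)) if states else BOTTOM


def gamma(interval):
    """Concretization: the set of integers contained in the interval."""
    if interval is BOTTOM:
        return frozenset()
    lo, hi = interval
    return frozenset(range(lo, hi + 1))


def f(states):
    """Concrete transformer of a hypothetical system: x ~> x + 2 while x < 6."""
    return frozenset(states) | {x + 2 for x in states if x < 6}


def best_transformer(interval):
    """alpha o f o gamma, the tightest sound approximation of f."""
    return alpha(f(gamma(interval)))


def abstract_fix_point(start):
    current = start
    while best_transformer(current) != current:
        current = best_transformer(current)
    return current
```

Starting from the interval `(0, 0)`, the abstract iteration stabilizes on an interval whose concretization contains every state reachable from 0, although it also contains unreachable ones: the abstraction is sound but not exact.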

First-order logic. Our method works with several logics that can be more or less
directly embedded in many-sorted first-order logic with equality [9,18] (including
propositional logic and quantified Boolean logic). For generality then, we present our work
in terms of that logic. We fix an infinite set $\mathcal{S}$ of sort symbols. For each
$\sigma \in \mathcal{S}$, we also fix an infinite set $\mathcal{X}_\sigma$ of variables (of sort
$\sigma$), with $\mathcal{X}_{\sigma_1}$ disjoint from $\mathcal{X}_{\sigma_2}$ for all distinct
$\sigma_1, \sigma_2 \in \mathcal{S}$, and let
$\mathcal{X} = \bigcup_{\sigma \in \mathcal{S}} \mathcal{X}_\sigma$. A many-sorted signature
$\Sigma$ consists of a set $\Sigma^S \subseteq \mathcal{S}$ of sort symbols, a set $\Sigma^P$ of
(sorted) predicate symbols $p^{\sigma_1 \cdots \sigma_n}$, and a set $\Sigma^F$ of (sorted) function
symbols $f^{\sigma_1 \cdots \sigma_n \sigma}$, where $n \geq 0$ and
$\sigma_1, \ldots, \sigma_n, \sigma \in \Sigma^S$. We drop the sort superscript
from function or predicate symbols when it is clear from context or unimportant.

For each $\sigma \in \Sigma^S$, a ($\Sigma$-)term of sort $\sigma$ is either a variable
$x \in \mathcal{X}_\sigma$ or an expression of the form
$f^{\sigma_1 \cdots \sigma_n \sigma}(t_1, \ldots, t_n)$ where
$f^{\sigma_1 \cdots \sigma_n \sigma} \in \Sigma^F$ and $t_i$ is a term of sort $\sigma_i$ for
$i = 1, \ldots, n$. An atomic ($\Sigma$-)formula is an expression of the form $t_1 = t_2$ where
$t_1$ and $t_2$ are terms of the same sort,³ or one of the form
$p^{\sigma_1 \cdots \sigma_n}(t_1, \ldots, t_n)$ with $n \geq 0$ where
$p^{\sigma_1 \cdots \sigma_n} \in \Sigma^P$ and $t_i$ is a $\Sigma$-term of sort $\sigma_i$ for
$i = 1, \ldots, n$. Non-atomic formulas with the usual Boolean connectives
($\mathit{false}, \vee, \wedge, \Rightarrow, \ldots$) and quantifiers ($\forall, \exists$) are defined
as expected. Free and bound occurrences of a variable in a formula are also defined
as usual. If $F$ is a $\Sigma$-formula and $(x_1, \ldots, x_n)$ a tuple of distinct variables, we
write $F[x_1, \ldots, x_n]$ to express that the free variables of $F$ are in $(x_1, \ldots, x_n)$;
furthermore, if $t_1, \ldots, t_n$ are terms with each $t_i$ of the same sort as $x_i$, we write
$F[t_1, \ldots, t_n]$ to denote the formula obtained from $F[x_1, \ldots, x_n]$ by simultaneously
replacing each occurrence of $x_i$ in $F$ by $t_i$, for $i = 1, \ldots, n$. We denote finite tuples
of elements by letters in bold font, and use comma (,) for tuple concatenation.

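For each signature $\Sigma$, a $\Sigma$-interpretation $\mathcal{M}$ is a mathematical structure
that maps: each $\sigma \in \Sigma^S$ to a non-empty set $M_\sigma$, the domain of $\sigma$ in
$\mathcal{M}$; each $x \in \mathcal{X}$ of sort $\sigma$ to an element
$x^{\mathcal{M}} \in M_\sigma$; each $f^{\sigma_1 \cdots \sigma_n \sigma} \in \Sigma^F$ to a total
function $f^{\mathcal{M}} : M_{\sigma_1} \times \cdots \times M_{\sigma_n} \to M_\sigma$
(and in particular each constant $c$ of sort $\sigma$ to an element $c^{\mathcal{M}} \in M_\sigma$);
each $p^{\sigma_1 \cdots \sigma_n} \in \Sigma^P$ to a set
$p^{\mathcal{M}} \subseteq M_{\sigma_1} \times \cdots \times M_{\sigma_n}$.

Every $\Sigma$-interpretation $\mathcal{M}$ over some $X \subseteq \mathcal{X}$ induces a unique
mapping $(\_)^{\mathcal{M}}$ from $\Sigma$-terms $f(t_1, \ldots, t_n)$ with variables in $X$ to
elements of sort domains such that
$(f(t_1, \ldots, t_n))^{\mathcal{M}} = f^{\mathcal{M}}(t_1^{\mathcal{M}}, \ldots, t_n^{\mathcal{M}})$.
A satisfiability relation $\models$ between such interpretations and $\Sigma$-formulas with
variables in $X$ can be defined inductively as usual. A $\Sigma$-interpretation $\mathcal{M}$
satisfies a $\Sigma$-formula $F$ if $\mathcal{M} \models F$. A $\Sigma$-formula $F$ is satisfiable if
it is satisfied by some $\Sigma$-interpretation. A set $\Gamma$ of $\Sigma$-formulas is satisfiable
if there is a $\Sigma$-interpretation that satisfies every formula in $\Gamma$.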

We are not generally interested in arbitrary formulas and interpretations but in
specific sets of $\Sigma$-formulas and specific classes of $\Sigma$-interpretations, for some
signature $\Sigma$. We collect these restrictions in the notion of a (sub)logic (of many-sorted
logic). More precisely, a logic is a triple $\mathcal{L} = (\Sigma, \mathbf{F}, \mathbf{M})$ where
$\Sigma$ is a signature; $\mathbf{F}$, the language of $\mathcal{L}$, is a set of
$\Sigma$-formulas; and $\mathbf{M}$ is a class of $\Sigma$-interpretations, the models of
$\mathcal{L}$, that is closed under variable reassignment, i.e.,
$\mathcal{M}[x \mapsto a] \in \mathbf{M}$ for all $\mathcal{M} \in \mathbf{M}$, all variables $x$
of sort $\sigma$ and all $a \in M_\sigma$, where $\mathcal{M}[x \mapsto a]$ is the
$\Sigma$-interpretation that maps $x$ to $a$ and is otherwise identical to $\mathcal{M}$. A formula
$F[\mathbf{x}]$ of $\mathcal{L}$ is satisfiable (resp., unsatisfiable) in $\mathcal{L}$ if it is
satisfied by some (resp., no) interpretation in $\mathbf{M}$. A set $\Gamma$ of formulas entails
in $\mathcal{L}$ a $\Sigma$-formula $F$, written $\Gamma \models_{\mathcal{L}} F$, if
$\Gamma \cup \{F\} \subseteq \mathbf{F}$ and every interpretation in $\mathbf{M}$ that
satisfies all formulas in $\Gamma$ satisfies $F$ as well. The set $\Gamma$ is satisfiable in
$\mathcal{L}$ if $\Gamma \not\models_{\mathcal{L}} \mathit{false}$.
Two formulas $F$ and $G$ are equivalent in $\mathcal{L}$ if $F \models_{\mathcal{L}} G$ and
$G \models_{\mathcal{L}} F$.⁴

3 We will use = also to denote equality at the meta-level, relying on context to disambiguate.

4 All these notions reduce to the corresponding standard ones in many-sorted logic when
$\mathbf{F}$ is the set of all $\Sigma$-formulas and $\mathbf{M}$ the class of all
$\Sigma$-interpretations.
3 Automatic definition of a computable abstract transformer via
an encoding in a decidable logic 

For the rest of the paper we fix a transition system $S = (Q, \mathcal{I}, \leadsto)$ and its
collecting semantics $\mathbb{S} = \mathrm{lfp}^{\subseteq}_{\mathcal{I}}(f)$ introduced earlier,
which coincides with the set of reachable states of $S$. Our main concern will be how to define
an abstract counterpart $f_A$ of $f$ in a suitable Galois connection
$\alpha : (\wp(Q), \subseteq) \leftrightarrows (A, \sqsubseteq_A) : \gamma$ so that we can define
$S$'s abstract semantics as

$$\mathbb{S}_A \stackrel{\mathrm{def}}{=} \mathrm{lfp}^{\sqsubseteq_A}_{I_A}(f_A)$$

where $I_A$ is in turn a suitable abstraction of $\mathcal{I}$. By well-known results by Kleene
and Cousot and Cousot [3,4], the fix point above can be computed or over-approximated so
that its concretization by $\gamma$ is a sound approximation of the concrete fix point
$\mathbb{S}$.

A main issue when using abstract interpretation in general is how to define $f_A$. In
practice, when the transition system is generated, as is often the case, by a program in
a certain programming language, the concrete transformer $f$ is defined constructively
in terms of the language's idioms (e.g., assignment, loop and conditional statements for
imperative languages) and memory model (e.g., heap, stack, etc.). The corresponding
abstract transformer must then handle all those constructs as well, and reflect their
respective actions in the abstract domain.

When the abstraction is defined via the unique adjoint property of the Galois
connection, the definition of $f_A$ is usually a manual, laborious chore. One has to design the
transformer in detail and then prove it sound, by showing that
$f(X) \subseteq \gamma(f_A(a))$ for all $a \in A$ and $X \subseteq \gamma(a)$. We present a method
that, under the right conditions, can instead compute a sound abstraction of $f$ completely
automatically. The method is applicable when the transition system and the concrete and
abstract domains can be encoded, as explained later, in a logic $\mathcal{L}$ satisfying a
number of requirements. For generality, we will describe our method in terms of an arbitrary
logic $\mathcal{L}$ satisfying those requirements. To have an idea, however, depending on the
concrete domain, possible examples of $\mathcal{L}$ would be propositional logic or several of
the many logics used in SMT: linear real arithmetic, linear integer arithmetic with arrays,
and so on.

Logic requirements. We assume a logic $\mathcal{L} = (\Sigma, \mathbf{F}, \mathbf{M})$ with a
decidable entailment relation $\models_{\mathcal{L}}$ and a language $\mathbf{F}$ closed under
all the Boolean operators.⁵ For each sort $\sigma$ in $\mathcal{L}$, we distinguish a set
$V_\sigma$ of variable-free terms, which we call values, such that
$\models_{\mathcal{L}} \neg(v_1 = v_2)$ for each distinct $v_1, v_2 \in V_\sigma$. Examples of
values would be integer constants, selected terms of the form $n/m$ where $n$ is an integer
constant and $m$ a non-zero numeral, and so on. We assume that the satisfiable formulas of
$\mathcal{L}$ are satisfied by values, that is, for every formula $F[\mathbf{y}]$ (with free
variables from $\mathbf{y}$) satisfiable in a model $\mathcal{M}$ of $\mathcal{L}$ there is a
value tuple $\mathbf{v}$ such that $F[\mathbf{v}]$ is satisfiable in $\mathcal{M}$.

We assume a total surjective encoding of $S$'s state space $Q$ to $n$-tuples of values in
the sense above, for some fixed $n$ (where each $n$-tuple encodes a state). Depending on
$\mathcal{L}$, states may be encoded, for instance, as tuples of Boolean constants or integer
constants, or mixed tuples of Boolean, integer and rational constants, and so on. Because of this



5 The latter is mostly to simplify the exposition. Weaker assumptions are possible. 
encoding, from now on we will identify states with tuples of values. Note that, thanks to
our various assumptions, each formula $F[\mathbf{y}_1, \ldots, \mathbf{y}_k]$ in $k \cdot n$
variables denotes a subset of $Q^k$, namely the set of all $k$-tuples of states that satisfy $F$.
We call that set the extension of $F$ and define it formally as follows:

$$[\![F]\!] \stackrel{\mathrm{def}}{=} \{(\mathbf{v}_1, \ldots, \mathbf{v}_k) \in Q^k \mid F[\mathbf{v}_1, \ldots, \mathbf{v}_k] \text{ is satisfiable in } \mathcal{L}\}.$$

We will refer to formulas like $F$ above as state formulas and say they are satisfied by the
state sequences in $[\![F]\!]$. For each state $\mathbf{v} = (v_1, \ldots, v_n) \in Q$ and tuple
$\mathbf{x} = (x_1, \ldots, x_n)$ of distinct variables of corresponding sort, we denote by
$A_{\mathbf{v}}$ the assignment formula $x_1 = v_1 \wedge \cdots \wedge x_n = v_n$, which is
satisfied exactly by $\mathbf{v}$.

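Finally, we assume the existence of an encoding of $S$ in $\mathcal{L}$, a pair of formulas of
$\mathcal{L}$,

$$(I[\mathbf{x}],\ T[\mathbf{x}, \mathbf{x}'])$$

with $\mathbf{x}$ and $\mathbf{x}'$ both of size $n$, where $I[\mathbf{x}]$ is a formula satisfied
exactly by the initial states of $S$, and $T[\mathbf{x}, \mathbf{x}']$ is a formula satisfied by
two reachable states $\mathbf{v}, \mathbf{v}'$ iff $\mathbf{v} \leadsto \mathbf{v}'$.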
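To make the encoding and the notion of extension concrete, here is a small sketch in which formulas are modeled as Python predicates over tuples of values, and satisfiability is decided by brute-force enumeration. The finite value set `VALUES` and the formulas `init` and `trans` are hypothetical choices for illustration; an actual implementation would hand real formulas to a SAT or SMT solver instead.

```python
from itertools import product

VALUES = range(5)   # hypothetical finite set of values for every variable
N = 2               # each state is an N-tuple of values


def init(x):
    """I[x]: x0 = 0 and x1 = 0."""
    return x[0] == 0 and x[1] == 0


def trans(x, xp):
    """T[x, x']: x0 < 4, x0' = x0 + 1 and x1' = x0."""
    return x[0] < 4 and xp[0] == x[0] + 1 and xp[1] == x[0]


def extension(formula, k):
    """[[F]]: all k-tuples of states that satisfy a formula over k state tuples."""
    states = list(product(VALUES, repeat=N))
    return {vs for vs in product(states, repeat=k) if formula(*vs)}


initial_states = extension(init, 1)
transitions = extension(trans, 2)
```

Here `extension(init, 1)` is a set of 1-tuples of states and `extension(trans, 2)` a set of pairs of states, matching the reading of a formula in $k \cdot n$ variables as a subset of $Q^k$.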

First abstraction: from sets of states to formulas of $\mathcal{L}$. We start with an
intermediate abstraction that maps sets of states to formulas representing those states. To do
that, we extend the language of $\mathcal{L}$ by closing it under a disjunction operator
$\bigvee$ that applies to (possibly infinite) sets of formulas of $\mathcal{L}$. We then extend
the notions of satisfiability, entailment and equivalence in $\mathcal{L}$ to the new language
as expected (e.g., for every set $\Gamma$ of formulas of $\mathcal{L}$, $\bigvee \Gamma$ is
satisfiable in an interpretation $\mathcal{M}$ if some $F \in \Gamma$ is satisfiable
in $\mathcal{M}$, and so on).⁶

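Let $\mathbf{F}_{\mathbf{x}}$ be the set of all formulas in the extended language above whose
free variables are from the same $n$-tuple $\mathbf{x}$. One can show that mutual entailment
between two formulas in $\mathbf{F}_{\mathbf{x}}$ is an equivalence relation. Let $[F]$ denote the
equivalence class of a formula $F$ with respect to this relation, and let $\mathbb{E}$ denote the
set of all those equivalence classes. Let $[\![[F]]\!] \stackrel{\mathrm{def}}{=} [\![F]\!]$ for
each $[F] \in \mathbb{E}$. The poset $(\mathbb{E}, \sqsubseteq_E)$ where

$$[F] \sqsubseteq_E [G] \quad \text{iff} \quad F \models_{\mathcal{L}} G$$

has a lattice structure with the following join and meet operators:
$[F] \sqcup_E [G] \stackrel{\mathrm{def}}{=} [F \vee G]$ and
$[F] \sqcap_E [G] \stackrel{\mathrm{def}}{=} [F \wedge G]$. It can be shown that the two functions⁷

$$\alpha_E : \wp(Q) \to \mathbb{E} \stackrel{\mathrm{def}}{=} \lambda V.\ \big[\bigvee \{A_{\mathbf{v}} \mid \mathbf{v} \in V\}\big]$$

$$\gamma_E : \mathbb{E} \to \wp(Q) \stackrel{\mathrm{def}}{=} \lambda E.\ [\![E]\!]$$

form a Galois connection. According to Proposition 2, the best sound abstract
transformer of $f$ wrt this connection is

$$f_E : \mathbb{E} \to \mathbb{E} \stackrel{\mathrm{def}}{=} \alpha_E \circ f \circ \gamma_E = \lambda E.\ \big[\bigvee \{A_{\mathbf{v}} \mid \mathbf{v} \in [\![E]\!] \cup \{\mathbf{u}' \mid \mathbf{u} \in [\![E]\!],\ \mathbf{u} \leadsto \mathbf{u}'\}\}\big]$$

6 This is just for theoretical convenience. In practice, our method will never work with
formulas $\bigvee \Gamma$ where $\Gamma$ is infinite.

7 We borrow $\lambda$-calculus' notation to denote mathematical functions.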
By our assumptions and the definition of $\alpha_E$, the most precise abstraction of
$\mathcal{I}$ is $\alpha_E(\mathcal{I}) = [I]$.⁸ It follows that in the abstract domain
$(\mathbb{E}, \sqsubseteq_E)$ we can define the following semantics for $S$:

$$\mathbb{S}_E \stackrel{\mathrm{def}}{=} \mathrm{lfp}^{\sqsubseteq_E}_{[I]}(f_E).$$

Second abstraction: changing fix point computation. For our later needs, we would 
like to have a fix point computation that actually enumerates the additional states dis- 
covered by the collecting semantics. The abstraction above, over-approximating sets of 
states by assignment formulas, is not well suited for that. Hence, we introduce another 
abstract transformer, on the same lattice $(\mathbb{E}, \sqsubseteq_E)$:

$$g_E : \mathbb{E} \to \mathbb{E} \stackrel{\mathrm{def}}{=} \lambda E.\ E \sqcup_E C\big(\{[A_{\mathbf{v}'}] \mid T[\mathbf{v}, \mathbf{v}'] \text{ is sat. in } \mathcal{L},\ \mathbf{v} \in [\![E]\!],\ \mathbf{v}' \notin [\![E]\!]\}\big)$$

where $C$ is some choice function over subsets of $\mathbb{E}$, returning one element of its
input set if the set is non-empty, and $[\mathit{false}]$ otherwise. This function maps each
equivalence class $E$ to one whose extension increases $[\![E]\!]$ with just one state, chosen
among the successors of the states in $[\![E]\!]$ according to the transition formula $T$. We
can use $g_E$ instead of $f_E$ in the fix point computation thanks to the following result.
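The difference between $f_E$ and $g_E$ can be seen on a finite example. In the sketch below, an equivalence class is represented directly by its extension (a set of states), the choice function is simulated by picking the first new successor in a fixed order, and `successors` is a hypothetical transition relation. These are illustrative assumptions, not the representation used by the method itself.

```python
def successors(v):
    # Hypothetical transition relation on the states 0..6.
    return [(2 * v) % 7, (v + 3) % 7]


def f_E(states):
    """Add all one-step successors at once."""
    return states | {v2 for v in states for v2 in successors(v)}


def g_E(states):
    """Add at most one new successor, mimicking the choice function C."""
    for v in sorted(states):
        for v2 in successors(v):
            if v2 not in states:
                return states | {v2}   # the single state picked by C
    return states                       # C returned [false]: nothing new


def lfp(step, start):
    """Iterate `step` from `start`; return the fix point and the number of strict steps."""
    current, steps = frozenset(start), 0
    while True:
        following = frozenset(step(current))
        if following == current:
            return current, steps
        current, steps = following, steps + 1


fix_f, steps_f = lfp(f_E, {1})
fix_g, steps_g = lfp(g_E, {1})
```

Both iterations reach the same set of states; `g_E` simply takes more, smaller steps, one per newly discovered state.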

Proposition 3 (Soundness). The transformers $f_E$ and $g_E$ have the same least fix point
greater than $[I]$, that is,
$\mathrm{lfp}^{\sqsubseteq_E}_{[I]}(f_E) = \mathrm{lfp}^{\sqsubseteq_E}_{[I]}(g_E)$.

Proof. Let us show that
$\mathrm{lfp}^{\sqsubseteq_E}_{[I]}(f_E) \sqsubseteq_E \mathrm{lfp}^{\sqsubseteq_E}_{[I]}(g_E) \sqsubseteq_E \mathrm{lfp}^{\sqsubseteq_E}_{[I]}(f_E)$.
For the first inequality, we have to show that for each element $x \in \mathbb{E}$, we can build a
$\sqsubseteq_E$-increasing chain $X \stackrel{\mathrm{def}}{=} x, g_E(x), g_E^2(x), \ldots$ such
that its lub $\sqcup_E X$ is greater than or equal to $f_E(x)$. Computing $f_E(x) \wedge \neg x$
characterizes the formula describing new states, reachable in a single transition from $x$.
The increasing chain $X$ is built by enumerating those elements as the ones produced by
the choice function $C$ in the successive applications of $g_E$.

For the second inequality, as the new element produced by the choice function $C$ is
chosen among the new states reachable in a single transition, we have for all elements
$x \in \mathbb{E}$, $g_E(x) \sqsubseteq_E f_E(x)$. □



Main abstraction: abstracting formulas in $\mathbf{F}_{\mathbf{x}}$. We now introduce our last
abstraction, mapping formulas in $\mathbf{F}_{\mathbf{x}}$ to elements of an abstract domain
$(A, \sqsubseteq_A)$ like those typically used in abstract interpretation tools (such as
intervals, polyhedra, and so on).

We assume that $A$ is fitted with a lattice structure with meet $\sqcap_A$ and join
$\sqcup_A$. We also assume the existence of a computable monotonic concretization function
$\gamma_F : A \to \mathbf{F}_{\mathbf{x}}$ which associates a formula of
$\mathbf{F}_{\mathbf{x}}$ to each element of $A$. Intuitively, we are requiring that
each element of $A$ be effectively representable as a formula, which in turn denotes a set
of states. This requirement is easily satisfied for many numerical abstract domains and
the sort of logics used in SMT. For instance, intervals can be mapped to conjunctions
of inequalities between variables and values; similarly, any linear-based abstraction can
be mapped to a conjunction of linear arithmetic constraints.
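For the interval case, a concretization function in the spirit of $\gamma_F$ could look like the sketch below. It renders a box (one interval per variable) as a conjunction of inequalities written as an SMT-LIB-style term. The dictionary representation of boxes and the function name are assumptions made for illustration, and the sketch only handles non-negative integer bounds, since negative numerals need a different syntax in SMT-LIB.

```python
def gamma_F(box):
    """Render a box as a conjunction of inequalities.

    `box` maps each variable name to a pair (lo, hi); a bound set to None
    means the variable is unbounded on that side.
    """
    conjuncts = []
    for var, (lo, hi) in sorted(box.items()):
        if lo is not None:
            conjuncts.append(f"(<= {lo} {var})")
        if hi is not None:
            conjuncts.append(f"(<= {var} {hi})")
    if not conjuncts:
        return "true"                  # no constraint at all: the top element
    if len(conjuncts) == 1:
        return conjuncts[0]
    return "(and " + " ".join(conjuncts) + ")"


formula = gamma_F({"x": (0, 10), "y": (None, 5)})
```

Each conjunct of the resulting formula is a separate property of the state variables, which is what later makes it possible to check the conjuncts for invariance one by one.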

8 Recall that $\mathcal{I}$ is the set of initial states of $S$ while $I$ is the formula denoting $\mathcal{I}$ in $\mathcal{L}$.



Invariant Stream Generators using Automatic Abstract Transformers



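Input: $a \in A$

$F[\mathbf{x}, \mathbf{x}'] := \gamma_F(a)[\mathbf{x}] \wedge T[\mathbf{x}, \mathbf{x}'] \wedge \neg\gamma_F(a)[\mathbf{x}']$
if $F$ is not satisfiable in $\mathcal{L}$ then
    return $a$
else
    let $\mathbf{v}, \mathbf{v}'$ be two states that satisfy $F[\mathbf{x}, \mathbf{x}']$
    return $a \sqcup_A \alpha_Q(\mathbf{v}')$

Fig. 1. Automatic abstract transformer $g_A$.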
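The procedure of Figure 1 can be sketched for the interval domain as follows. This is a toy illustration under several assumptions: formulas are Python predicates, the "solver" `solve` is a brute-force search over a hypothetical finite range `UNIVERSE` standing in for an SMT solver, and the transition formula `trans` describes a made-up system.

```python
from itertools import product

UNIVERSE = range(0, 50)   # hypothetical finite range searched by the toy solver
BOTTOM = None             # bottom element of the interval domain


def gamma_F(a):
    """Interval -> predicate on a state."""
    if a is BOTTOM:
        return lambda x: False
    return lambda x: a[0] <= x <= a[1]


def alpha_Q(v):
    """State abstraction: the singleton interval."""
    return (v, v)


def join(a, b):
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def trans(x, xp):
    """T[x, x'] for a hypothetical system: x' = x + 3 while x < 10."""
    return x < 10 and xp == x + 3


def solve(formula):
    """Toy stand-in for an SMT solver: return a model (v, v') or None."""
    candidates = product(UNIVERSE, repeat=2)
    return next(((v, vp) for v, vp in candidates if formula(v, vp)), None)


def g_A(a):
    inside = gamma_F(a)
    model = solve(lambda x, xp: inside(x) and trans(x, xp) and not inside(xp))
    if model is None:
        return a                      # F unsatisfiable: post fix point reached
    _, v_new = model
    return join(a, alpha_Q(v_new))


a = (0, 0)
while g_A(a) != a:
    a = g_A(a)
```

Note that termination is detected from the unsatisfiability of `F` alone. The result over-approximates the reachable states 0, 3, 6, 9, 12 of the toy system, as expected from an interval abstraction.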



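With $\gamma : A \to \mathbb{E} \stackrel{\mathrm{def}}{=} (\lambda F.\ [F]) \circ \gamma_F$ we
obtain the Galois connection

$$\alpha_\gamma : (\mathbb{E}, \sqsubseteq_E) \leftrightarrows (A, \sqsubseteq_A) : \gamma$$

where, by Proposition 1, $\alpha_\gamma$ is uniquely determined by $\gamma$.

Finally, we assume the existence of a state abstraction function $\alpha_Q : Q \to A$ which
directly associates states to their abstract counterparts in $A$ but is such that
$\alpha_\gamma([A_{\mathbf{v}}]) \sqsubseteq_A \alpha_Q(\mathbf{v})$ for each
$\mathbf{v} \in Q$. In other words, $\alpha_\gamma$ is at least as precise as $\alpha_Q$ when
abstracting formulas satisfied by exactly one state.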



The abstract transformer. Our main idea is to derive automatically a sound abstract
transformer $g_A$ for $g_E$ by relying on the concretization function $\gamma$, the state
abstraction $\alpha_Q$, and a sound, complete and terminating satisfiability solver for the
logic $\mathcal{L}$. We require that for each satisfiable state formula
$F[\mathbf{x}_1, \ldots, \mathbf{x}_k]$ the solver is able to return a
state sequence $\mathbf{v}_1, \ldots, \mathbf{v}_k$ satisfying $F$.

The computation of the image of an abstract element $a \in A$ under $g_A$ is described in
Figure 1. Figure 2 motivates its soundness. The satisfiability tests and the choice of the
states $\mathbf{v}$ and $\mathbf{v}'$ in the figure are performed by the solver for
$\mathcal{L}$, which then plays for $g_A$ the role of the choice function in the definition of
$g_E$. We point out that, while the fix point is usually computed in the abstract with the
$g_A$ function, with our approach it is not necessary to transfer back the element $a \in A$
to detect the post fix point: we know that $g_A$ has reached that point when the formula $F$
in Figure 1 is unsatisfiable.




Fig. 2. Abstract transformer computation. 






Pierre-Loïc Garoche, Temesghen Kahsai and Cesare Tinelli



$I_A := \bot$
while (there is a state $\mathbf{v}$ satisfying $I[\mathbf{x}] \wedge \neg\gamma_F(I_A)[\mathbf{x}]$) do
    $I_A := I_A \sqcup_A \alpha_Q(\mathbf{v})$
return $I_A$

Fig. 3. Initial states over-approximation. $\bot$ is the bottom element of $A$.
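The loop of Figure 3 can be sketched in the same toy setting: intervals as the abstract domain and a brute-force search over a hypothetical finite range `UNIVERSE` in place of the logic solver. The initial-state formula `init` is an invented example.

```python
BOTTOM = None                 # bottom element of the interval domain
UNIVERSE = range(-20, 21)     # hypothetical finite range searched by the toy solver


def init(x):
    """I[x] for a hypothetical system: x * x = 9 or x = 5."""
    return x * x == 9 or x == 5


def contains(a, x):
    """Does the concretization of interval `a` contain state `x`?"""
    return a is not BOTTOM and a[0] <= x <= a[1]


def join(a, b):
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def initial_abstraction():
    i_a = BOTTOM
    while True:
        # Look for an initial state not yet covered by i_a.
        v = next((x for x in UNIVERSE if init(x) and not contains(i_a, x)), None)
        if v is None:
            return i_a                # I and not gamma_F(i_a) is unsatisfiable
        i_a = join(i_a, (v, v))       # alpha_Q(v) is the singleton interval


I_A = initial_abstraction()
```

The loop stops as soon as no uncovered initial state remains, so every initial state lies in the concretization of the returned element.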

To prove the soundness of our abstract transformer, we rely on the join-completeness
property of Galois connections.

Proposition 4 (Join-completeness for $\alpha$). If
$\alpha : (A, \sqsubseteq) \leftrightarrows (B, \sqsubseteq) : \gamma$, then $\alpha$ is
join-complete, that is, $\alpha(\bigsqcup_{x \in X} x) = \bigsqcup_{x \in X} \alpha(x)$.

Theorem 1 (Soundness). The abstract transformer $g_A$ is a sound approximation of $g_E$.

Proof. Figure 2 summarizes the following proof elements. Let us first consider the best
abstract transformer $g_A^b$ with respect to the Galois connection: for all $a \in A$,
$g_A^b(a) = \alpha_\gamma(g_E(\gamma(a)))$, that is,
$g_A^b(a) = \alpha_\gamma(\gamma(a) \sqcup_E [A_{\mathbf{v}'}])$ where $[A_{\mathbf{v}'}]$ is
defined as the equivalence class associated with the new state $\mathbf{v}'$ produced by the
choice function. We remark that the state $\mathbf{v}' \in Q$ in the definition of $g_A$ is an
arbitrary new state satisfying $F$. We now have to prove that the transformer $g_A$ computed
by our procedure is sound with respect to the best transformer, i.e., for all $a \in A$,
$g_A^b(a) \sqsubseteq_A g_A(a)$.

Let $\mathbf{v}' \in Q$ be the new state generated by the solver. Then,
$g_A(a) = a \sqcup_A \alpha_Q(\mathbf{v}')$. Using Proposition 4 on the Galois connection
$(\alpha_\gamma, \gamma)$, we have that
$\alpha_\gamma(\gamma(a) \sqcup_E [A_{\mathbf{v}'}]) = \alpha_\gamma \circ \gamma(a) \sqcup_A \alpha_\gamma([A_{\mathbf{v}'}])$.
By reductivity of $\alpha_\gamma \circ \gamma$ and soundness of $\alpha_Q$ with respect to
$\alpha_\gamma$, we have both $\alpha_\gamma \circ \gamma(a) \sqsubseteq_A a$ and
$\alpha_\gamma([A_{\mathbf{v}'}]) \sqsubseteq_A \alpha_Q(\mathbf{v}')$. It follows that
$\alpha_\gamma \circ \gamma(a) \sqcup_A \alpha_\gamma([A_{\mathbf{v}'}]) \sqsubseteq_A a \sqcup_A \alpha_Q(\mathbf{v}')$
and $g_A^b(a) \sqsubseteq_A g_A(a)$. □

Our eventual goal is to compute or approximate the fix point
$\mathrm{lfp}^{\sqsubseteq_A}_{I_A}(g_A)$ where $I_A$ is a sound over-approximation of the
initial state formula or, more precisely, where $[I] \sqsubseteq_E \gamma(I_A)$. Depending on
the formula $I$ and the abstract domain $A$, computing $I_A$ directly from $[I]$ may not be
feasible. In that case, we rely on the logic solver again to approximate $I_A$. A basic
algorithm for doing that is described in Figure 3. In practice, a widening operator $\nabla$
will be used in lieu of the simple join $\sqcup_A$ to ensure convergence.

Theorem 2 (Soundness). The element returned by the algorithm in Figure 3 is a
sound approximation of $[I]$.

Proof. The initial states over-approximation $I_A$ is defined as a fix point over the
monotonic function adding over-approximations, through $\alpha_Q$, of new reachable states.
Using Tarski's theorem, such a fix point exists. Let us consider an initial state
$\mathbf{i} \in Q$ and its associated equivalence class $[A_{\mathbf{i}}]$ such that
$[A_{\mathbf{i}}] \not\sqsubseteq_E \gamma(I_A)$. Then, by definition of $\gamma$ and $[\cdot]$,
$\mathbf{i}$ is not represented by the formula $\gamma_F(I_A)$ and
$I[\mathbf{i}] \wedge \neg\gamma_F(I_A)[\mathbf{i}]$ is satisfiable.
Then $I_A$ is not a fix point. □

Figure 4 summarizes our overall framework: the analysis is computed in a tradi- 
tional abstract domain A but using a logical solver to supply a sound abstract trans- 
former on A. 
Fig. 4. Global framework: combination of abstractions.

4 On-the-fly invariant generation 

A formula $F[\mathbf{x}]$ is an invariant for $S$ if $[\![F]\!]$ includes the set $R_S$ of all
reachable states of $S$. Invariants have many useful applications in static analysis,
logic-based model checking, and deductive verification in general. In our abstract domain
$\mathbb{E}$ from Section 3, any formula $F$ such that
$\mathrm{lfp}^{\sqsubseteq_E}_{[I]}(f_E) \sqsubseteq_E [F]$ is an invariant, since
$R_S = [\![\mathrm{lfp}^{\sqsubseteq_E}_{[I]}(f_E)]\!] \subseteq [\![F]\!]$.⁹ By the construction
of our abstraction in the domain $A$, any fix point computation for the transformer
$g_A : A \to A$ starting with the abstract element $I_A$ computed by the algorithm in Figure 3
produces a value $a$ such that $\gamma_F(a)$ is an invariant for $S$.

A notable feature of our approach is that, in practice, we can modify the fix point
computation for $g_A$ to generate intermediate invariants, so to speak, as it goes and before
reaching the fix point. We capitalize on the fact that $\gamma_F(a)$ is typically a conjunction
of formulas, or properties, $P_1, \ldots, P_m$. For any intermediate value $a \in A$ constructed
during the fix point computation for $g_A$, if $\gamma_F(a) = P_1 \wedge \cdots \wedge P_m$ we can check whether
any of the $P_i$'s is already invariant. This can be done, for instance, by checking that $P_i$
is inductive or, more generally, k-inductive [21], using the solver for $\mathcal{L}$. We discuss an
efficient mechanism for doing that for multiple properties at the same time in previous
work [15]. Here we point out that such a mechanism can be used in our approach to
turn the abstract interpreter for $A$ into an invariant stream generator.
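
For reference, the inductiveness checks mentioned in this paragraph can be spelled out as follows. This is the standard textbook formulation of k-induction, not a transcription of the multi-property mechanism of [15]. Here $I$ and $T$ are the initial-state and transition formulas of $S$, $P$ is one candidate conjunct, and the state-variable copies $x_0, \ldots, x_k$ are notation introduced only for this sketch.

```latex
% Base case: P holds in the first k states of every execution.
I[x_0] \wedge \bigwedge_{j=0}^{k-2} T[x_j, x_{j+1}]
  \;\Rightarrow\; \bigwedge_{j=0}^{k-1} P[x_j]

% Inductive step: k consecutive states satisfying P are always followed by a state satisfying P.
\bigwedge_{j=0}^{k-1} \bigl( P[x_j] \wedge T[x_j, x_{j+1}] \bigr)
  \;\Rightarrow\; P[x_k]
```

If both implications are valid, $P$ is invariant. For $k = 1$ they reduce to ordinary induction: $I[x_0] \Rightarrow P[x_0]$ and $P[x_0] \wedge T[x_0, x_1] \Rightarrow P[x_1]$. A solver establishes each implication by showing that its negation is unsatisfiable.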

The invariants generated in the earlier iterations of the interpreter are usually the 
simplest ones, e.g., bounds on a variable, and become increasingly more elaborate as 
the computation proceeds. The main point is that one does not need to wait until the end 
of a possibly complex fix point computation (using a wide set of costly abstractions)
to obtain simple invariants such as interval bounds for variables, equalities between 
variables and so on. Even better, the auxiliary invariants generated on the fly can be used 
to improve the precision of the very fix point computation that generated them. We
can do that by modifying the algorithm in Figure 1 to use the following strengthening



9 Of course, obtaining a formula from the equivalence class $\mathrm{lfp}(f_E)$ would be enough for
all purposes since that class consists of the strongest invariants for $S$. However, in general, such
formulas may be infinitary or impractical to compute.
    node parallel_counters (a, b, c: bool) returns (x, y: int; obs: bool);
    var n1, n2: int;
    let
      n1 = 10000; n2 = 5000;
      x = 0 -> if (b or c) then 0 else
               if a and (pre x) < n1 then (pre x) + 1 else pre x;
      y = 0 -> if c then 0 else
               if a and (pre y) < n2 then (pre y) + 1 else pre y;
      obs = (x = n1) implies (y = n2);
    tel

Fig. 5. Double counter example in Lustre.

of the formula $F$ defined there:

$$\gamma_F(a)[x] \wedge T[x, x'] \wedge \mathit{In}[x'] \wedge \neg\gamma_F(a)[x'] \qquad (1)$$

where, at each call of $g_A$, the new subformula $\mathit{In}$ is the conjunction of the auxiliary
invariants generated until then. This increases the precision of $g_A$ while maintaining
its soundness since it removes from consideration states that do not satisfy the current
invariants (and so are necessarily unreachable).

Example 1. Consider the simple transition formula $T[x, x'] := (G[x] \Rightarrow x' = -1) \wedge
(\neg G[x] \Rightarrow x' = x + 1)$ in a logic of integer arithmetic, for some $G$. Then suppose the
current result of the fix point computation for $g_A$ is a value $a$ with $\gamma_F(a) = x \geq 0 \wedge P[x]$
for some sub-property $P$. Suppose also that the sub-property $x \geq 0$ has been identified
as invariant. As defined in Figure 1, the computation of $g_A(a)$ could very well produce
two states $n$, $n'$ for $x$, $x'$ with $n'$ negative. Since $x \geq 0$ is invariant, both of these states
are in fact unreachable. Using (1) will rule out that pair of states. □
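
Example 1 can be made concrete with a small brute-force sketch in Python, where enumeration over a bounded range stands in for the solver. The example leaves $G$ and $P$ unspecified, so the choices below ($G[x]$ as $x = 3$, $P[x]$ as $x \leq 5$, and an initial state $x = 4$, which makes $x \geq 0$ a genuine invariant because $G$ never fires on reachable states) are hypothetical and serve only as illustration.

```python
# Hypothetical instantiation of Example 1; G, P and the value range are assumptions.
def G(x):
    return x == 3          # guard; never true on states reachable from x = 4

def P(x):
    return x <= 5          # some sub-property of the current abstract value

def gamma_a(x):
    return x >= 0 and P(x)  # gamma_F(a) = x >= 0 /\ P[x]

def T(x, x_next):
    # (G[x] => x' = -1) /\ (not G[x] => x' = x + 1)
    return (not G(x) or x_next == -1) and (G(x) or x_next == x + 1)

def In(x):
    return x >= 0          # sub-property already identified as invariant

VALUES = range(-10, 11)    # bounded search space replacing the solver

def solutions(strengthen):
    """Pairs (n, n') satisfying the original formula or its strengthening (1)."""
    return [(n, m) for n in VALUES for m in VALUES
            if gamma_a(n) and T(n, m) and not gamma_a(m)
            and (In(m) if strengthen else True)]

plain = solutions(strengthen=False)
strengthened = solutions(strengthen=True)
print(plain)          # [(3, -1), (5, 6)]
print(strengthened)   # [(5, 6)]
```

The pair (3, -1) is a model of the original formula although state 3 is unreachable; adding $\mathit{In}[x']$ discards it and keeps only the genuinely new state 6.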

For many of the common domains that we can use for $A$, the current value $a$ of the
fix point computation is actually expressed as a meet $a_1 \sqcap_A \cdots \sqcap_A a_k$ of other elements.
Moreover, $\gamma_F$ is meet-complete, i.e., defined so that
$\gamma_F(a_1 \sqcap_A \cdots \sqcap_A a_k) = \gamma_F(a_1) \wedge \cdots \wedge \gamma_F(a_k)$.
This means that the invariant $\mathit{In}$ in (1) can be traced back to an $i \in A$ such that
$\gamma_F(i) = \mathit{In}$. With this in mind, one can then understand the strengthened definition of
$g_A$ as inducing a reduced domain $A'$ by the closure operator
$\rho : A \to A' \stackrel{\mathrm{def}}{=} \lambda a.\, a \sqcap_A i$
(cf. [7, §6]). Since the widening operators used in fix point computations are in general
non-monotonic, enforcing invariants while using widening is helpful in reducing the
loss of precision caused by widening.
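
The interaction between widening and the closure operator $\rho$ can be sketched on the interval domain. The Python fragment below is a minimal illustration, not the authors' OCaml implementation: intervals are plain pairs, the widening is the usual interval widening, and the two iterates and the invariant element $i$ (standing for $x \geq 0$) are hypothetical values.

```python
import math

def meet(a, b):
    """Interval meet (intersection); intervals are (lo, hi) pairs."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def widen(a, b):
    """Usual interval widening: any bound that is not stable jumps to infinity."""
    lo = a[0] if b[0] >= a[0] else -math.inf
    hi = a[1] if b[1] <= a[1] else math.inf
    return (lo, hi)

def rho(i):
    """Closure operator rho = lambda a. a meet i induced by the invariant element i."""
    return lambda a: meet(a, i)

i = (0, math.inf)            # element with gamma_F(i) = In, here In is x >= 0
prev, nxt = (2, 5), (1, 5)   # two successive iterates with an unstable lower bound

w = widen(prev, nxt)         # widening alone loses the lower bound
r = rho(i)(w)                # enforcing the invariant recovers it

print(w)   # (-inf, 5)
print(r)   # (0, 5)
```

Applying `rho(i)` a second time leaves `r` unchanged, as expected of a closure operator.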

5 Application: invariant generation for Lustre programs 

This work was motivated by the problem of proving safety (i.e., invariant) properties 
of Lustre programs. Lustre [12] is a synchronous data-flow specification/programming 
language with infinite streams of values of three basic types: Booleans, integers, and 
reals. It is used to model control software in embedded devices. Properties to be proved
invariant are often introduced within Lustre programs as observer Boolean streams.
Checking their invariance amounts to checking that their corresponding flow is
constantly true. In previous work, we have developed a k-induction-based parallel model
checker, called Kind [17], which uses SMT solvers as its main reasoning engine. Kind
benefits from the use of auxiliary invariant generators to strengthen its basic k-induction
procedure [16]. We have implemented the fix point computation method described here
as an additional on-line invariant generator for Kind.

Kind actually works with an idealized version of Lustre that treats Lustre numerical 
types as infinite-precision. Idealized Lustre programs can be readily recast as transition 
systems in a three-sorted concrete domain with Booleans, (mathematical) integers and 
reals. Such systems can be almost directly encoded and reasoned about in a quantifier- 
free logic of mixed integer and real arithmetic with uninterpreted function symbols. The 
linear fragment of that logic, which we could call QF_UFLIRA in the nomenclature of 
SMT-LIB [1], can be efficiently decided by most major SMT solvers. 10 

This means that Lustre programs limited to linear arithmetic are amenable to analysis
with our method. As abstract domain we use one defined, as usual, as a reduced
product of a variety of abstract domains, including relational and non-relational
ones; partitioning mechanisms allow our tool to express some non-linear properties.
Our implementation of the function $\gamma_F$ converts abstract elements into formulas of
QF_UFLIRA as one would expect: an interval $[a; b]$ for a variable $x$ is converted into
the formula $a \leq x \wedge x \leq b$; a linear constraint $\sum_i a_i \cdot x_i \geq c$ is mapped directly to the
corresponding formula of QF_UFLIRA. The translation is extended homomorphically to
more complex elements. For instance, elements that are the meet of other ones (such as
polyhedra, etc.) are converted to the conjunction of the conversion of the components.
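
The conversion just described can be sketched as a small recursive function that emits SMT-LIB terms as strings. This is an illustrative Python sketch under stated assumptions: the classes `Interval`, `LinCons` and `Meet` are hypothetical stand-ins for abstract elements and are not APRON's API, and the actual implementation is written in OCaml.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Interval:            # hypothetical element: lo <= var <= hi
    var: str
    lo: int
    hi: int

@dataclass(frozen=True)
class LinCons:             # hypothetical element: sum(coeff * var) >= bound
    coeffs: Tuple[Tuple[int, str], ...]
    bound: int

@dataclass(frozen=True)
class Meet:                # hypothetical element: meet of several elements
    parts: tuple

def num(n: int) -> str:
    """SMT-LIB writes a negative numeral as an application of unary minus."""
    return str(n) if n >= 0 else f"(- {-n})"

def gamma_F(e) -> str:
    """Translate an abstract element into an SMT-LIB formula (as a string)."""
    if isinstance(e, Interval):
        return f"(and (<= {num(e.lo)} {e.var}) (<= {e.var} {num(e.hi)}))"
    if isinstance(e, LinCons):
        terms = [f"(* {num(c)} {v})" for c, v in e.coeffs]
        lhs = terms[0] if len(terms) == 1 else "(+ " + " ".join(terms) + ")"
        return f"(>= {lhs} {num(e.bound)})"
    if isinstance(e, Meet):
        # Meet-completeness: a meet is translated to a conjunction.
        return "(and " + " ".join(gamma_F(p) for p in e.parts) + ")"
    raise TypeError(f"unsupported abstract element: {e!r}")

element = Meet((Interval("x", 0, 10000), LinCons(((1, "y"), (-1, "x")), 0)))
formula = gamma_F(element)
print(formula)
# (and (and (<= 0 x) (<= x 10000)) (>= (+ (* 1 y) (* (- 1) x)) 0))
```

Because each case only builds a term from its components, extending the translation to a further domain amounts to adding one more branch.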

Our implementation is written in OCaml, relies on the APRON abstract domain
library [14], and shares with Kind, also written in OCaml, modules to encode Lustre
programs as transition systems in the QF_UFLIRA logic and to interact with the SMT
solver.

Example 2. Let us illustrate the use of our invariant generator on a typical example:
counters, which are used widely within safety mechanisms for critical systems. 11 In the
Lustre program shown in Figure 5, two counters x and y are incremented up to their
respective maximum value whenever the input value a is true; both are reset to 0 when
the input c is true. The counter x is reset also when the input b is true. Suppose we
would like to prove that whenever x reaches its maximum value, so does y. This property
is expressed by the synchronous observer obs. It is enough to show then that the Boolean
stream obs is equal to the constant stream true.

The invariant generator discovers without any special tuning the fact that $x \in
[0; 10000]$ and $y \in [0; 5000]$. With additional partitioning parameters, it can in fact
generate the target property itself: $x = n_1 \Rightarrow y = n_2$. Focusing on x and y, the only
two stateful variables in the program, 12 Figure 6 shows the first four states enumerated

10 That includes CVC3 and Yices [2,8], the SMT solvers used by Kind. 

11 The example and the tools described here can be found at
http://clc.cs.uiowa.edu/Kind/SAS12.

12 The Lustre expression (pre x) denotes the value of x in the preceding state. 
The injected states are, in order, (0,0), (0,1), (1,1) and
(2,2). After the injection of (1,1), the current abstract
element is described by the dark triangle: $0 \leq x \leq 1$,
$0 \leq y \leq 1$ and $x \leq y$. When using our partitioned analysis
(with explicit partitioning), we also obtain properties
under the implication: $y < n_2 \Rightarrow \ldots$ At the fourth iteration,
three sub-properties of the one expressed by the
dark triangle are proven invariant: $0 \leq x$, $0 \leq y$ and
$y < n_2 \Rightarrow x \leq y$. Those invariants are immediately
communicated to Kind's k-induction engine, but are also used
internally to constrain the following iterations.

Fig. 6. First four steps of the fix point computation for the example. 




(a) First 7 iterations of the fix point computation (b) Final fix point

Fig. 7. Iterations and final fix point for example.

by our fix point algorithm and injected into the abstract domain. At the fourth iteration
of the computation the following properties are already identified as invariant: $x \geq 0$,
$y \geq 0$ and $y < n_2 \Rightarrow x \leq y$.

Figure 7 shows another intermediate element obtained before widening, as well as
the final abstract element obtained, the fix point of the abstract collecting semantics.
On this example, using k-induction alone Kind is not able to prove in reasonable time
the property expressed by obs. In principle it could, but since the property is 10000-inductive,
k-induction requires too many unrollings of the system's transition relation
to scale. However, using the auxiliary invariant $y < n_2 \Rightarrow x \leq y$ produced at the
fourth iteration of our invariant generator, along with the bounds obtained without any
partitioning, Kind is able to prove the target property instantaneously.
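
The effect described in this example can be reproduced by brute force on a scaled-down version of the system. The sketch below is written in Python rather than Lustre, replaces the SMT solver by explicit enumeration, and uses the hypothetical bounds `N1 = 4` and `N2 = 2` in place of 10000 and 5000; all helper names are illustrative assumptions. It checks the inductive step for obs with and without the auxiliary invariants, and confirms by exhaustive reachability that those auxiliary invariants really hold.

```python
from itertools import product

N1, N2 = 4, 2   # assumed small stand-ins for n1 = 10000 and n2 = 5000

def step(state, a, b, c):
    """One transition of the two counters of Figure 5."""
    x, y = state
    nx = 0 if (b or c) else (x + 1 if a and x < N1 else x)
    ny = 0 if c else (y + 1 if a and y < N2 else y)
    return nx, ny

def obs(s):
    x, y = s
    return x != N1 or y == N2            # (x = n1) implies (y = n2)

def aux(s):
    x, y = s                             # bounds and (y < n2 implies x <= y)
    return 0 <= x <= N1 and 0 <= y <= N2 and (y >= N2 or x <= y)

INPUTS = list(product([False, True], repeat=3))
# The induction step quantifies over all states, including unreachable ones.
UNIVERSE = list(product(range(-2, N1 + 3), repeat=2))

def step_case_holds(prop, assume=lambda s: True):
    """Every successor of a state satisfying prop (and assume) satisfies prop."""
    return all(prop(step(s, *i))
               for s in UNIVERSE if prop(s) and assume(s)
               for i in INPUTS)

def reachable(init=(0, 0)):
    """Exhaustive exploration of the (finite) reachable state space."""
    seen, todo = {init}, [init]
    while todo:
        s = todo.pop()
        for i in INPUTS:
            t = step(s, *i)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

base = obs((0, 0))
alone = step_case_holds(obs)                 # plain 1-induction
helped = step_case_holds(obs, assume=aux)    # 1-induction with auxiliary invariants
aux_is_invariant = all(aux(s) for s in reachable())
print(base, alone, helped, aux_is_invariant)   # True False True True
```

Without help, the step case fails on the unreachable state (3, 0), whose successor (4, 1) violates obs. The auxiliary invariants exclude that state, so a single induction step suffices.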

6 Conclusion and further work 

The framework we presented offers two main contributions: (i) a systematic and automatic
generation of abstract transformers relying on logic solvers and abstract domain
libraries; (ii) the generation of invariants during the computation of post fix points.
Although this paper focused mainly on least fix point computations of a forward semantics,
the approach can be applied in a wide range of settings: computing a greatest fix
point or analyzing a backward semantics is directly expressible in the framework without
major modifications. This approach is truly automatic whenever the target system
can be encoded in a suitable decidable logic, and abstract domain elements are representable
in that logic. Such conditions are often easy to satisfy for systems already
analyzable with SMT solvers, and for numerous available abstract domains. Under those
conditions one obtains an abstract interpreter for free. There are no restrictions on the
language constructs handled or on the specific abstract domains that could
be used. Furthermore, our framework facilitates the expression of big step semantics
(on the logical side) and therefore avoids the loss of precision incurred when applying
abstract transfer functions at a small step semantics level.

To our knowledge, our initial implementation of the framework is the only avail- 
able tool based on abstract interpretation and Kleene-style fix point computation that 
provides invariants before the post fix point is reached. In a multi-analyzer setting, 
the possibility to share invariants before the end of the computation can drastically in- 
crease performance. But that sort of intermediate but guaranteed information could be 
extremely valuable even in a standalone use. For example, when analyzing a 200k-loc 
critical embedded software for the absence of run time errors [6], one could observe 
during the computation the sections of the code that are already proven (e.g., no divi- 
sion by zero at a certain statement) or (false) alarms. This contrasts with the current 
general practice where one has to wait, possibly for hours, for the fix point computation 
to end before interpreting the results, and seeing perhaps that certain parameters need 
further tuning. 

We have implemented our method and applied it to Lustre programs in the context 
of a larger project on the analysis of synchronous systems. Further work will involve a 
more extensive experimental evaluation of the method to assess its benefits on a larger 
scale. 